The authors consider the interesting and important issue of Bayesian inference based 
on objective functions other than the likelihood. They focus on model selection in the 
low-dimensional setting using prequential local proper scoring rules. 


1 General non-likelihood-based inference 

There is a large and disparate literature on inference based on objective functions other 
than the likelihood. We will briefly mention some examples here, but we believe that a 
more thorough review and comparison would be a worthy endeavor. 

Numerous objective functions have been proposed to replace the (log-)likelihood in 
pursuit of various inference goals. Proper scoring rules are a natural choice for serving as 
such objective functions, due to their property of being minimized (in expectation) under 
the true model. Depending on the goal of the analysis, certain well-known proper scoring 
rules can achieve robustness (e.g., continuous ranked probability score, or CRPS), have 
simple closed-form expressions (e.g., Dawid-Sebastiani score), or do not require densities 
(e.g., CRPS) or normalizing constants (e.g., Hyvarinen score, as in the present paper). 
See Gneiting and Katzfuss (2014) for a recent review of these and other scoring rules. 

In a frequentist context, examples of approaches falling into this category of 
scoring-rule-based inference are minimum contrast estimation (e.g., Pfanzagl, 1969; 
Birgé and Massart, 1993), composite likelihood (e.g., Lindsay, 1988), and M-estimation 
(e.g., Huber and Ronchetti, 2009). Some further review is given in Dawid et al. (2014). 

There have also been related approaches in the Bayesian framework. Shaby (2014) 
provides a nice review of Bayesian inference using general objective functions and, based 
on results of Chernozhukov and Hong (2003), he proposes an “open-faced sandwich 
adjustment” to obtain pseudo-posteriors with properly calibrated frequentist properties. 
Further, the “Gibbs posterior” (Jiang and Tanner, 2008; Li et al., 2013) has received 
considerable interest, where the negative log-likelihood is replaced by some “empirical 
risk” (usually targeting the specific parameter to be estimated) to construct a 
pseudo-posterior of the form 


$$Q(\theta) \propto \exp\{-\lambda R_n(\theta)\}\,\pi(\theta), \qquad (1)$$

where $\lambda$ is a positive scaling constant (often called “temperature”). Sampling from the 
pseudo-posterior $Q$ can be performed via standard MCMC algorithms. 
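
As a minimal sketch of how sampling from a pseudo-posterior of the form (1) might proceed, 
the following Python code runs a random-walk Metropolis sampler in one dimension. The 
mean-absolute-error risk (which targets the median), the N(0, 10^2) prior, the step size, 
and all function names are illustrative assumptions, not taken from the cited references. 

```python
import math
import random


def empirical_risk(theta, data):
    """Mean absolute error: an assumed empirical risk R_n targeting the median."""
    return sum(abs(x - theta) for x in data) / len(data)


def log_prior(theta, scale=10.0):
    """Log density, up to a constant, of an assumed N(0, scale^2) prior."""
    return -0.5 * (theta / scale) ** 2


def log_gibbs_posterior(theta, data, lam):
    """Unnormalized log of Q(theta) = exp{-lam * R_n(theta)} * pi(theta)."""
    return -lam * empirical_risk(theta, data) + log_prior(theta)


def metropolis(data, lam, n_iter=5000, step=0.5, seed=0):
    """Random-walk Metropolis draws from the Gibbs posterior (no burn-in removed)."""
    rng = random.Random(seed)
    theta = 0.0
    current = log_gibbs_posterior(theta, data, lam)
    draws = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_gibbs_posterior(proposal, data, lam)
        # Accept with probability min(1, Q(proposal) / Q(theta)).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        draws.append(theta)
    return draws
```

Running `metropolis(data, lam=1.0)` and `metropolis(data, lam=200.0)` on the same data 
illustrates the role of the scaling constant: the larger value concentrates the draws much 
more tightly around the sample median. 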

2 Objective Bayesian model selection 

In objective Bayesian model selection, a discrete prior is assumed on a (finite) class of 
models, and given a particular model, objective improper priors are placed on the model 
parameters. While improper priors are commonly used for analysis of a single model, one 
faces difficulties in comparing models via Bayes factors, since the marginal likelihoods 
of the competing models are only specified up to arbitrary constants. A number of 
remedies have been proposed in the literature to deal with this issue, such as fractional 
Bayes factors (O’Hagan, 1995) and intrinsic Bayes factors (Berger and Pericchi, 1996). 

In the present paper, the authors take a different approach, which relies on replacing 
the (log-)marginal likelihood by a local proper scoring rule. The Hyvarinen score is 
recommended as a default. From the expression of the Hyvarinen score in the authors’ 
equation (16), it can be seen that the arbitrary constant disappears. The authors look at 
examples where the Hyvarinen scores are analytically tractable and provide asymptotic 
orders for the difference in Hyvarinen scores assuming the respective models to be true. 
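
To make the disappearance of the constant explicit, consider one common form of the 
Hyvarinen score for a density $q$ on $\mathbb{R}^k$. This form is an assumption here; the 
authors' equation (16) may differ from it by a constant factor. Since the score depends on 
$q$ only through derivatives of $\ln q$, multiplying $q$ by any constant $c > 0$ leaves it 
unchanged: 

```latex
S(x, q) = \Delta \ln q(x) + \tfrac{1}{2}\,\lVert \nabla \ln q(x) \rVert^2 ,
\qquad
\ln\{c\,q(x)\} = \ln c + \ln q(x)
\;\Longrightarrow\;
\nabla \ln\{c\,q(x)\} = \nabla \ln q(x), \quad
\Delta \ln\{c\,q(x)\} = \Delta \ln q(x)
\;\Longrightarrow\;
S(x, c\,q) = S(x, q).
```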

Some clarification regarding practical implementation of the model selection procedure 
presented here would be helpful. When can we be sure that one model is truly better 
than another? In other words, can anything be said about posterior model probabilities 
(see also Section 3 below)? Can the necessary quantities be computed for models beyond 
the simple Gaussian examples considered in the paper? 

3 Scaling issues 

As indicated in (1) above, the literature on Gibbs posteriors typically includes a 
multiplicative scaling constant $\lambda$ on the objective function. The choice of $\lambda$ is 
considered a critical issue, as it has a direct effect on the (pseudo-)posterior 
uncertainty. Shaby (2014) does not consider a multiplicative scaling of the objective 
function, but his open-faced sandwich correction automatically adjusts for such scaling, 
and his approach is thus invariant to scaling. Without such a correction, the scaling issue also arises when the 
objective function is specified to be a proper scoring rule, including the Hyvarinen score. 
As implicitly acknowledged by the authors in their Footnote 2, the scaling of a proper 
scoring rule is arbitrary, in that any proper scoring rule is still proper when multiplied 
by a constant. 

In the context of model selection between models $M_1$ and $M_2$ with scores $S_{M_1}$ and 
$S_{M_2}$, respectively, the scaling constant can arbitrarily inflate or deflate the pseudo 
Bayes factor, 

$$\frac{\exp(\lambda S_{M_1})}{\exp(\lambda S_{M_2})},$$

and thus the amount of evidence in favor of $M_1$ over $M_2$ (cf. Kass and Raftery, 1995). 
This also makes it challenging to compute pseudo posterior model probabilities, such as 

$$P(M_1 \mid x) = \frac{\exp(\lambda S_{M_1})}{\exp(\lambda S_{M_1}) + \exp(\lambda S_{M_2})}. \qquad (2)$$

If $S_{M_1}$ is larger than $S_{M_2}$, (2) can be arbitrarily close to 0.5 or 1 by choosing 
$\lambda$ to be very small or very large, respectively. 
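
This sensitivity is easy to reproduce numerically. The sketch below assumes that the 
pseudo posterior probability of the first model takes the normalized form 
exp(λS1) / (exp(λS1) + exp(λS2)) with equal prior model weights; the function name and 
the two score values are hypothetical and chosen only for illustration. 

```python
import math


def pseudo_posterior_prob(s1, s2, lam):
    """Pseudo posterior probability of model 1 under equal prior model weights.

    Equals exp(lam*s1) / (exp(lam*s1) + exp(lam*s2)), evaluated as a logistic
    function of lam * (s1 - s2) to avoid overflow for large |lam|.
    """
    z = lam * (s1 - s2)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


# Hypothetical scores with s1 > s2; only their difference matters.
s1, s2 = -10.0, -12.0
for lam in (0.001, 0.1, 1.0, 10.0):
    prob = pseudo_posterior_prob(s1, s2, lam)
    print(f"lambda = {lam:>6}: pseudo P(M1) = {prob:.4f}")
```

With the score difference held fixed, the reported probability moves from near 0.5 toward 
1 as the scaling constant grows, so the scores alone do not determine the strength of 
evidence. 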

In light of these scaling issues, how should model selection be calibrated and 
interpreted? Moreover, is it possible to handle more than two competing models or even 
high-dimensional settings, where the number of competing models may grow exponentially 
with the sample size? In the high-dimensional linear regression context, Johnson 
and Rossell (2012) showed that a number of commonly used procedures (including 
fractional and intrinsic Bayes factors) assign vanishingly small posterior probabilities to the 
true model with increasing sample size. The scaling issue may assume an even more 
important role in such cases. 